Proportional apportionment is the problem of assigning seats to parties 
according to their relative share of votes. Divisor methods are the de-facto 
standard solution, used in many countries. 

In recent literature, there are two algorithms that implement divisor methods: 
one by Cheng and Eppstein [CE14] has worst-case optimal running time 
but is complex, while the other [Puk14] is relatively simple and fast in practice 
but does not offer worst-case guarantees. 

We demonstrate that the former algorithm is much slower than the other 
in practice and propose a novel algorithm that avoids the shortcomings of 
both. We investigate the running-time behavior of the three contenders in 
order to determine which is most useful in practice. 


1. Introduction 

The problem of proportional apportionment arises whenever we have a finite supply 
of $k$ indivisible, identical resource units which we have to distribute across $n$ parties 
fairly, that is, according to the proportional share of publicly known and agreed-upon 
values $v_1, \dots, v_n$ (of the sum $V = \sum_i v_i$ of these values). We elaborate in this section on 
applications of and solutions for this problem. 


*Department of Computer Science, University of Kaiserslautern; {reitzig, wild}@cs.uni-kl.de 


Apportionment arises naturally in politics. Here are two prominent examples: 

• In a proportional-representation electoral system we have to assign seats in parliament 
to political parties according to their share of all votes. 

The resources are seats, and the values are vote counts. 

• In federal states the number of representatives from each component state often 
reflects the population of that state, even though there will typically be at least 
one representative for any state no matter how small it is. 

Resources are again seats, values are the numbers of residents. 

In order to use consistent language throughout this article, we will stick to the first 
metaphor. That is, we assign $k$ seats to parties $[1..n]$ proportionally to their respective 
votes $v_i$, and we call $k$ the house size. 

A fair allocation should assign $k \cdot v_i/V$ seats to party $i$, where $V = v_1 + \cdots + v_n$ is the 
total vote count of all parties. In case of electoral systems which exclude parties below a 
certain threshold of overall votes from seat allocation altogether, we assume they have 
already been removed from our list of $n$ parties. 

As seats are indivisible, this is only possible if, by chance, all $k \cdot v_i/V$ are integers; otherwise 
we have to come up with some rounding scheme. This is where apportionment methods 
come into play. The books by Balinski and Young [BY01] and Pukelsheim [Puk14] give 
comprehensive introductions to the topic with its historical, political and mathematical 
dimensions. 

Mathematically speaking, an apportionment method is a function $f : \mathbb{R}_{>0}^n \times \mathbb{N} \to \mathbb{N}_0^n$ that 
maps vote counts $v = (v_1, \dots, v_n)$ and house size $k$ to a seat allocation $s = (s_1, \dots, s_n) := f(v, k)$ 
so that $s_1 + \cdots + s_n = k$. We interpret $s$ as party $i$ getting $s_i$ seats. 

There are many conceivable such methods, but there are at least three natural properties 
one would like apportionment systems to have: 

(P1) Pairwise vote monotonicity: When votes change, $f$ should not take away seats 
from a party that has gained votes while at the same time awarding seats to one 
that has lost votes. 

(P2) House monotonicity: $f$ should not take seats away from any party when the house 
grows (in number of seats) but votes do not change. 

(P3) Quota rule: The number of seats of each party should be its proportional share, 
rounded either up or down. 

Balinski and Young have shown that 

• (P1) implies (P2) [BY01, Cor. 4.3.1], 

• no method can always guarantee (P1) and (P3) [BY01, Thm. 6.1], and 




Method                  Divisor sequence           δ(x)                                    Sandwich
Smallest divisors       0, 1, 2, 3, ...            x                                       -
Greatest divisors       1, 2, 3, 4, ...            x + 1                                   -
Sainte-Laguë            1, 3, 5, 7, ...            2x + 1                                  -
Modified Sainte-Laguë   1.4, 3, 5, 7, ...          2x + 1 if x ≥ 1; 1.6x + 1.4 if x < 1    2x + 1.2 ± 0.2
Equal Proportions       0, √2, √6, √12, ...        √(x(x + 1))                             x + 1/4 ± 1/4
Harmonic Mean           0, 4/3, 12/5, 24/7, ...    2x(x + 1)/(2x + 1)                      x + 1/4 ± 1/4
Imperiali               2, 3, 4, 5, ...            x + 2                                   -
Danish                  1, 4, 7, 10, ...           3x + 1                                  -

Table 1: Commonly used divisor methods [CE14, Table 1]. For each of the methods, we 
give a possible continuation δ of the respective divisor sequence (cf. Section 2) 
as well as linear sandwich bounds on δ, if non-trivial (cf. Lemma 2). 


• (P1) holds exactly for divisor methods [BY01, Thm. 4.3]. 

Property (P1) is essential for upholding the principle of “one-person, one-vote”, an ideal 
pursued by electoral systems around the globe and occasionally enforced by law [Puk14, 
Section 2.4]. Therefore, divisor methods (a.k.a. Huntington methods) can be the only choice, for 
the price of (P3). Other choices can be made, of course; the aforementioned books [BY01, 
Puk14] discuss different trade-offs. 


Divisor methods are characterized by divisor sequences which control the notion of 
“fairness” implemented by the respective method. There are many popular choices (cf. 
Table 1). It is not per se clear which divisor sequence is the best; there still seems to be 
active discussion, e.g., for the U.S. House of Representatives. One reason is that no one 
has yet been able to propose a convincing, universally agreed-upon mathematical criterion 
that would single out one method as superior to the others. In fact, there are competing 
notions of fairness, each favoring a different divisor method [BY01, Section A.3]. A 
reasonable approach is therefore to run computer simulations of different methods and 
compare their outcomes empirically, for example w.r.t. the distribution of final average 
votes per seat $v_i/s_i$. For this purpose, many apportionments have to be computed, so 
efficient algorithms can become an issue. 

We thus study the problem of computing the final seat allocation by divisor methods 
(given by their divisor sequences) according to vote counts and house size. 

For the case of almost linear divisor sequences, the problem can be solved in time $O(n)$; 
this has been shown by Cheng and Eppstein [CE14], who propose a worst-case 
running-time-optimal algorithm which we call ChengEppsteinSelect. It is quite involved and 
rather difficult to implement (cf. Appendix C.3). 


Pukelsheim [Puk14], on the other hand, proposes algorithm JumpAndStep whose 
running time is not asymptotically optimal in the worst case but tends to perform well in 
practice, at least if some insight about the used divisor sequence is available and inputs 
are good-natured (cf. Appendix C.2). 

After introducing divisor methods formally in Section 2, we propose a new algorithm in 
Section 3 that also attains the $O(n)$ worst-case running time bound but is straightforward 
to implement and efficient in practice as well. It is based on a generalization of our 
solution for the envy-free stick-division problem [RW15b]. 

We finally compare the performance of the three contending algorithms with extensive 
running time experiments, an executive summary of which we give in Section 4. 

Additional material includes an index of notation in Appendix F. 


2. Divisor Methods Formalized 


Let $d = (d_j)_{j \ge 0}$ be an arbitrary divisor sequence, i.e. a nonnegative, strictly increasing 
and unbounded sequence of real numbers. We formally set $d_{-1} := -\infty$. 

We require that there is a smooth continuation of $d$ on the reals which is easy to invert. 
That is, we assume a function $\delta : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge d_0}$ with 

i) $\delta$ is continuous and strictly increasing, 

ii) $\delta^{-1}(x)$ for $x \ge d_0$ can be computed with a constant number of arithmetic operations, 
and 

iii) $\delta(j) = d_j$ (and thus $\delta^{-1}(d_j) = j$) for all $j \in \mathbb{N}_0$. 

All the divisor sequences used in practice fulfill these requirements; cf. Table 1. For 
convenience, we continue $\delta^{-1}$ on the complete real line requiring 

iv) $\delta^{-1}(x) \in [-1, 0)$ for $x < d_0$. 

Corollary 1: Assuming i) to iv), $\delta^{-1}(x)$ is continuous and strictly increasing on $\mathbb{R}_{\ge d_0}$. 
Furthermore, it is the inverse of $j \mapsto d_j$ in the sense that 

$$\lfloor \delta^{-1}(x) \rfloor = \max\{ j \in \mathbb{Z}_{\ge -1} \mid d_j \le x \}$$

for all $x \in \mathbb{R}$. □ 

In particular, $\lfloor \delta^{-1}(x) \rfloor = j$ for $d_j \le x < d_{j+1}$, so the floored $\delta^{-1}$ is the (zero-based) rank 
function for the set of all $d_j$ as long as $x \ge d_0$. 

Note how this reproduces what is called d-rounding in the literature [BY01, Puk14]; we 
obtain an efficient way of calculating this function via $\delta^{-1}$. 


Now the set of all seat assignments that are valid w.r.t. $d$ is given by [BY01] 

$$S(v, k) = \Bigl\{ s \in \mathbb{N}_0^n \;\Big|\; \sum_{i=1}^{n} s_i = k \;\wedge\; \exists a > 0.\ \forall i \in [1..n].\ s_i \in \lfloor \delta^{-1}(v_i \cdot a) \rfloor + \{0, 1\} \Bigr\}.$$

We call a realization of $a$ a proportionality constant $a^*$; intuitively, every seat corresponds 
to roughly $1/a^*$ votes. 

An equivalent definition is by the set of possible results of the following algorithm [BY01, 
Prop. 3.3]. 

Algorithm 1: IterativeMethod_d(v, k): 

Step 1 Initialize $s = 0^n$. 

Step 2 While $k > 0$, 

Step 2.1 Determine $I = \arg\min_{i=1}^{n} d_{s_i}/v_i$. 

Step 2.2 Update $s_I \leftarrow s_I + 1$ and $k \leftarrow k - 1$. 

Step 3 Return $s$. 
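
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' Java implementation: it assumes a linear divisor sequence $d_j = \alpha j + \beta$, positive integer votes, and exact rational arithmetic via `fractions.Fraction`; the function name `iterative_method` and the tie-breaking by party index are choices made here for illustration only. The sketch also returns the last selected value, which coincides with the proportionality constant from equation (1).

```python
import heapq
from fractions import Fraction


def iterative_method(votes, k, alpha, beta):
    """Assign k seats one by one; returns (seats, a_star).

    Assumes d_j = alpha * j + beta and positive integer votes.
    """
    def d(j):
        return Fraction(alpha) * j + Fraction(beta)

    seats = [0] * len(votes)
    # Priority queue keyed by d_{s_i} / v_i, the "price" of party i's next seat.
    heap = [(d(0) / Fraction(v), i) for i, v in enumerate(votes)]
    heapq.heapify(heap)
    a_star = None
    for _ in range(k):
        a_star, i = heapq.heappop(heap)      # Step 2.1: party with minimal d_{s_i}/v_i
        seats[i] += 1                        # Step 2.2: award the seat
        heapq.heappush(heap, (d(seats[i]) / Fraction(votes[i]), i))
    return seats, a_star


if __name__ == "__main__":
    # Three parties, three seats, Sainte-Lague-like divisors 1, 3, 5, ...
    print(iterative_method([5, 3, 1], 3, 2, 1))
```

Ties are resolved arbitrarily here (by party index); a complete implementation would report all valid seat assignments instead.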


We can obtain a proportionality constant [Puk14, 59f] by 

$$a^* = \max\{ d_{s_i - 1}/v_i \mid 1 \le i \le n \}, \qquad (1)$$

which in turn defines the set $S(v, k)$. 

Note that we work with $d_j/v_i$ instead of $v_i/d_j$ as in the classical literature; Cheng and 
Eppstein [CE14] and we prefer the reciprocals because the case $d_0 = 0$ is then handled 
gracefully and without special treatment. Therefore, our $a^*$ is also the reciprocal of the 
proportionality constant as e.g. Pukelsheim [Puk14] defines it, we multiply by $a$ in the 
definition of $S$, and we take the minimum in IterativeMethod. It is important to note 
that the defined set $S$ remains unchanged by this switch. 

Following the notation of Cheng and Eppstein [CE14], we furthermore define for given 
votes $v = (v_1, \dots, v_n) \in \mathbb{Q}_{>0}^n$ the sets 

$$A_i := \{ a_{i,j} \mid j = 0, 1, 2, \dots \} \quad\text{with}\quad a_{i,j} := \frac{d_j}{v_i}$$

and their multiset union 

$$\mathcal{A} := \biguplus_{i=1}^{n} A_i.$$

As we will see later, the relative rank of elements in $\mathcal{A}$ turns out to be of interest; we 
therefore define the rank function $r(x, \mathcal{A})$ which denotes the number of elements in 
multiset $\mathcal{A}$ that are no larger than $x$, that is 

$$r(x, \mathcal{A}) := |\mathcal{A} \cap (-\infty, x]| = \sum_{i=1}^{n} |\{ a_{i,j} \in A_i \mid a_{i,j} \le x \}|. \qquad (2)$$

We write $r(x)$ instead of $r(x, \mathcal{A})$ when $\mathcal{A}$ is clear from context. 

We need two more convenient shorthands: Assuming we have $a^* < \bar{x}$, we denote with 

$$I_{\bar{x}} := \{ i \in \{1, \dots, n\} \mid v_i > d_0/\bar{x} \} \qquad (3)$$

the set of parties that can hope for a seat, and with 

$$\mathcal{A}^{\bar{x}} := \biguplus_{i \in I_{\bar{x}}} \Bigl\{ \frac{d_j}{v_i} \in A_i \;\Big|\; \frac{d_j}{v_i} < \bar{x} \Bigr\} = \biguplus_{i=1}^{n} \Bigl\{ \frac{d_j}{v_i} \in A_i \;\Big|\; \frac{d_j}{v_i} < \bar{x} \Bigr\} = \mathcal{A} \cap (-\infty, \bar{x}) \qquad (4)$$

the multiset of elements from sequences of these parties that are smaller than $\bar{x}$, i.e. 
reasonable candidates for $a^*$. 


3. Fast Apportionment by Rank Selection 


From (1) together with strict monotonicity of $d$, we obtain immediately that $a^* = \mathcal{A}_{(k)}$, 
i.e. the $k$th smallest element of $\mathcal{A}$ (counting duplicates) is a suitable proportionality 
constant. This allows us to switch gears from the iteration-based world of Pukelsheim 
[Puk14] to selection-based algorithms, as previously seen by Cheng and Eppstein [CE14]. 

Note that even though $\mathcal{A}$ is infinite, $\mathcal{A}_{(k)}$ always exists because the terms $a_{i,j} = d_j/v_i$ are 
strictly increasing in $j$ for all $i \in \{1, \dots, n\}$. 


Borrowing terminology from the field of mathematical optimization, we call $a$ feasible if 
$r(a) \ge k$, otherwise it is infeasible. Feasible $a \ne a^*$ are called suboptimal. Our goal is to 
find a subset of $\mathcal{A}$ that contains $a^*$ but as few infeasible or suboptimal $a$ as possible; we 
can then apply a rank-selection algorithm on this subset and obtain (via $a^*$) the solution 
to the apportionment problem. 

Now since $d$ is unbounded, setting any upper bound $\bar{x}$ on the $a_{i,j}$ yields a finite search 
space $\mathcal{A}^{\bar{x}}$. By choosing any such bound that maintains $|\mathcal{A}^{\bar{x}}| \ge k$, we retain the property 
that $a^*$ is the $k$th smallest element under consideration. 


One naive way is to make sure that the party with the most votes (which should get 
the most seats) contributes at least $k$ values to $\mathcal{A}^{\bar{x}}$. This can be achieved by letting 
$\bar{x} = d_{k-1}/\max v + \varepsilon$ (cf. the proof of Theorem 3). This alone, however, leads only 
to an algorithm with worst-case running time in $\Theta(kn)$, which is worse than even 
IterativeMethod (with priority queues). 

We can actually not improve this upper bound $\bar{x}$; it is tight for the case that one party 
has many more votes than all others and gets (almost) all of the seats. We can, however, 
exclude many individual elements in $\mathcal{A}^{\bar{x}}$ because they are too small to be feasible or too 
large to be optimal. 
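
The naive selection idea described above can be sketched as follows. This is an illustration under assumptions not made in the text: a linear divisor sequence $d_j = \alpha j + \beta$, positive integer votes, $k \ge 1$, the particular choice $\varepsilon = \alpha/(2 \max v)$, and sorting as a stand-in for a proper selection algorithm. The function name `naive_select` is hypothetical.

```python
from fractions import Fraction


def naive_select(votes, k, alpha, beta):
    """Return a* as the k-th smallest element of the finite multiset A^{x_bar}."""
    def d(j):
        return Fraction(alpha) * j + Fraction(beta)

    v_max = max(votes)
    eps = Fraction(alpha) / (2 * v_max)          # any small positive value works here
    x_bar = d(k - 1) / v_max + eps               # upper bound on a*
    candidates = []
    for v in votes:
        j = 0
        while d(j) / v < x_bar:                  # collect all d_j / v_i below x_bar
            candidates.append(d(j) / v)
            j += 1
    candidates.sort()                            # stand-in for rank selection
    return candidates[k - 1]


if __name__ == "__main__":
    print(naive_select([5, 3, 1], 4, 2, 1))
```

In the worst case the candidate list contains on the order of $kn$ elements, which is exactly the weakness the sandwich bounds are meant to remove.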

Towards finding suitable upper and lower bounds on $a^*$, we investigate its rank in the 
multiset $\mathcal{A}$ of all candidates. All we know is that 

$$k \le r(a^*) < k + |I_{\bar{x}}|$$

since we may have any number between one and $|I_{\bar{x}}|$ parties that tie for the last seat. 
We can still make an ansatz with $r(\overline{a}) \ge k + |I_{\bar{x}}|$ and $r(\underline{a}) < k$, express rank function $r$ 
in terms of $\delta^{-1}$ (cf. Lemma 4 in Appendix B) and derive that 

$$\sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \underline{a}) \le k - |I_{\bar{x}}| \quad\text{and}\quad \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \overline{a}) \ge k. \qquad (5)$$

This pair of inequalities is indeed a sufficient condition for admissible pairs of bounds 
$(\underline{a}, \overline{a})$; we can conclude that $\underline{a} \le a^* \le \overline{a}$. For a formal proof, see Lemma 5 in Appendix B. 

We now want to derive a sandwich on $a^*$ by fulfilling the inequalities in (5) as tightly as 
possible. Depending on $\delta^{-1}$, this may be hard to do analytically. However, we can make 
the same assumption as Cheng and Eppstein [CE14] and explicitly compute suitable 
bounds for divisor sequences which behave roughly linearly. This does not limit the scope 
of our investigation by much; see Appendix A for more on this. 

Lemma 2: Assume the continuation $\delta$ of divisor sequence $d$ fulfills 

$$\alpha x + \underline{\beta} \le \delta(x) \le \alpha x + \overline{\beta}$$

for all $x \in \mathbb{R}_{\ge 0}$ with $\alpha > 0$, $\underline{\beta} \in [0, \alpha]$ and $\overline{\beta} \ge 0$. Let further some $\bar{x} > a^*$ be given. 
Then, the pair $(\underline{a}, \overline{a})$ defined by 

$$\underline{a} := \max\Bigl\{ 0,\ \frac{\alpha k - (\alpha - \underline{\beta}) \cdot |I_{\bar{x}}|}{V_{\bar{x}}} \Bigr\} \quad\text{and}\quad \overline{a} := \frac{\alpha k + \overline{\beta} \cdot |I_{\bar{x}}|}{V_{\bar{x}}}$$

with $V_{\bar{x}} := \sum_{i \in I_{\bar{x}}} v_i$ fulfills the conditions of Lemma 5, that is $\underline{a} \le a^* \le \overline{a}$. Moreover, 

$$|\mathcal{A}^{\bar{x}} \cap [\underline{a}, \overline{a}]| \le 2 \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot |I_{\bar{x}}|.$$

The proof consists mostly of rote calculation towards applying Lemma 5; see Appendix B 
for the details. 

We have now derived our main improvement over the work by Cheng and Eppstein 
[CE14]; where they have only a one-sided bound on $a^*$ and thus have to employ an 
involved search on $\mathcal{A}$, we have sandwiched $a^*$ from both sides, and so tightly that the 
remaining search space is small enough for a simple rank selection to be efficient. 


Building on the bounds from Lemma 2, we can improve upon the naive idea using only 
$\bar{x}$ by excluding also many more elements from $\mathcal{A}$ which are for sure not $a^*$. Since we 
remove in particular too small elements, this means that we also have to modify the rank 
we select; we will see that our bounds are chosen so that we can use $\delta^{-1}$ to count the 
number of elements we discard exactly. 

Recall that we assume a fixed apportionment scheme, that is fixed $d$ with known $\alpha$, $\underline{\beta}$ 
and $\overline{\beta}$ as per Lemma 2. 

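Algorithm 2: SandwichSelect_d(v, k): 

Step 1 Find $v^{(1)} = \max\{v_1, \dots, v_n\}$. 

Step 2 Set $\bar{x} := d_{k-1}/v^{(1)} + \varepsilon$ for suitable¹ constant $\varepsilon > 0$. 

Step 3 Compute $I_{\bar{x}}$ as per (3). 

Step 4 Compute $\underline{a}$ and $\overline{a}$ as per Lemma 2. 

Step 5 Initialize $\hat{A} := \emptyset$ and $\hat{k} := k$. 

Step 6 For all $i \in I_{\bar{x}}$, do: 

Step 6.1 Compute $\underline{j} := \max\{0, \lceil \delta^{-1}(v_i \cdot \underline{a}) \rceil\}$ and $\overline{j} := \lfloor \delta^{-1}(v_i \cdot \overline{a}) \rfloor$. 

Step 6.2 Add all $d_j/v_i$ to $\hat{A}$ for which $\underline{j} \le j \le \overline{j}$. 

Step 6.3 Update $\hat{k} \leftarrow \hat{k} - \underline{j}$. 

Step 7 Select and return $\hat{A}_{(\hat{k})}$. 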
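
To make the steps of Algorithm 2 concrete, here is a hedged Python sketch. It is not the authors' Java code: it assumes the exactly linear case $d_j = \alpha j + \beta$ with $0 \le \beta \le \alpha$ (so both sandwich constants equal $\beta$), positive integer votes, $k \ge 1$, the choice $\varepsilon = \alpha/(2 v^{(1)})$, and exact rationals to sidestep rounding issues. Sorting replaces the linear-time selection of Step 7, so this sketch does not attain the $O(n)$ bound; the name `sandwich_select` is chosen here for illustration.

```python
import math
from fractions import Fraction


def sandwich_select(votes, k, alpha, beta):
    """Return a* for d_j = alpha * j + beta (0 <= beta <= alpha, k >= 1)."""
    alpha, beta = Fraction(alpha), Fraction(beta)

    def d(j):
        return alpha * j + beta

    def delta_inv(x):
        # Inverse of the continuation; values below d_0 are mapped into [-1, 0).
        return max((x - beta) / alpha, Fraction(-1))

    v_max = max(votes)                                        # Step 1
    x_bar = d(k - 1) / v_max + alpha / (2 * v_max)            # Step 2
    parties = [v for v in votes if v * x_bar > d(0)]          # Step 3: I_xbar
    total, m = sum(parties), len(parties)
    a_lo = max(Fraction(0), (alpha * k - (alpha - beta) * m) / total)  # Step 4
    a_hi = (alpha * k + beta * m) / total
    candidates, k_hat = [], k                                 # Step 5
    for v in parties:                                         # Step 6
        j_lo = max(0, math.ceil(delta_inv(v * a_lo)))         # Step 6.1
        j_hi = math.floor(delta_inv(v * a_hi))
        candidates.extend(d(j) / v for j in range(j_lo, j_hi + 1))  # Step 6.2
        k_hat -= j_lo                                         # Step 6.3
    return sorted(candidates)[k_hat - 1]                      # Step 7 (by sorting)


if __name__ == "__main__":
    print(sandwich_select([7, 4, 2, 1], 6, 1, 1))
```

For the votes (7, 4, 2, 1) with $k = 6$ and divisors 1, 2, 3, ..., only three candidates survive the sandwich, and the second smallest of them is $a^* = 1/2$.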


Theorem 3: 
Algorithm 2 computes $a^*$ in time $O(n)$ for any divisor sequence $d$ that fulfills the 
requirements of Lemma 2. 

Proof: First, we have to show that $\bar{x}$ as we compute it in Steps 1 and 2 is correct. We 
have $\bar{x} > a^* = \mathcal{A}_{(k)}$ as already $r(\bar{x} - \varepsilon) = r(d_{k-1}/v^{(1)}) \ge k$; at least the $k$ elements 
$d_0/v^{(1)}, \dots, d_{k-1}/v^{(1)} \in \mathcal{A}$ are no larger than $d_{k-1}/v^{(1)}$. We thus never need to consider elements 
$a \ge \bar{x}$, and in particular $\mathcal{A}_{(k)} = \mathcal{A}^{\bar{x}}_{(k)}$ with $\mathcal{A}^{\bar{x}} = \mathcal{A} \cap (-\infty, \bar{x})$. 

So far, we have needed no additional restriction on $\varepsilon$ in Step 2; we only need it to be 
positive so we do not discard $a^*$ by accident if it is exactly $d_{k-1}/v^{(1)}$. However, the size 
of $\mathcal{A}^{\bar{x}}$ can be arbitrarily large, depending on the input values $v_i$, which we do not want. 
Therefore, we require 

$$0 < \varepsilon < \frac{d_k - d_{k-1}}{v^{(1)}}; \qquad (6)$$

such exists because $d$ is strictly increasing. Note how then $\bar{x} < d_k/v^{(1)}$ so we do not keep 
any additional suboptimal values. 

¹Neither correctness nor $\Theta$-running-time is affected by the choice of $\varepsilon$ here since it affects only the size 
of $I_{\bar{x}}$ which is bounded by $n$ in any case. In particular, the size of $\hat{A}$ is affected only up to a constant 
factor. For tweaking performance in practice, see the proof of Theorem 3. 


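From Step 4 on, we then construct multiset $\hat{A} \subseteq \mathcal{A}$ as the subsequent union of $A_i \cap [\underline{a}, \overline{a}]$, 
that is 

$$\hat{A} = \biguplus_{i \in I_{\bar{x}}} \Bigl\{ \frac{d_j}{v_i} \;\Big|\; \underline{j}(i) \le j \le \overline{j}(i) \Bigr\}$$
$$= \biguplus_{i \in I_{\bar{x}}} \Bigl\{ \frac{d_j}{v_i} \;\Big|\; \delta^{-1}(v_i \cdot \underline{a}) \le j \le \delta^{-1}(v_i \cdot \overline{a}) \Bigr\}$$
$$= \biguplus_{i \in I_{\bar{x}}} \Bigl\{ \frac{d_j}{v_i} \;\Big|\; v_i \cdot \underline{a} \le d_j \le v_i \cdot \overline{a} \Bigr\}$$
$$= \biguplus_{i \in I_{\bar{x}}} \Bigl\{ \frac{d_j}{v_i} \;\Big|\; \underline{a} \le \frac{d_j}{v_i} \le \overline{a} \Bigr\}$$
$$= \mathcal{A}^{\bar{x}} \cap [\underline{a}, \overline{a}].$$

In particular, the last step follows from (4) with $\bar{x} > a^*$. By Lemma 2, we know that 
$\underline{a} \le a^* \le \overline{a}$ for the bounds computed in Step 4, so we get in particular that $a^* \in \hat{A}$. 

It remains to show that we calculate $\hat{k}$ correctly. Clearly, we discard with $a_{i,0}, \dots, a_{i,\underline{j}-1}$ 
exactly $\underline{j}$ elements in Step 6.2, that is $|A_i \cap (-\infty, \underline{a})| = \underline{j}(i)$. Therefore, we compute with 

$$\hat{k} = k - \sum_{i \in I_{\bar{x}}} |A_i \cap (-\infty, \underline{a})| = r(a^*, \mathcal{A}^{\bar{x}}) - |\mathcal{A}^{\bar{x}} \cap (-\infty, \underline{a})| = r(a^*, \hat{A})$$

the correct rank of $a^*$ in $\hat{A}$. 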

For the running time, we observe that the computations in Steps 1 to 5 are easily 
done with $O(n)$ primitive instructions. The loop in Step 6 and therewith Steps 6.1 
and 6.3 are executed $|I_{\bar{x}}| \le n$ times. The overall number of set operations in Step 6.2 
is $|\hat{A}| \in O(|I_{\bar{x}}|) \subseteq O(n)$ (cf. Lemma 2). Finally, Step 7 runs in time $O(|\hat{A}|) \subseteq O(n)$ when 
using a (worst-case) linear-time rank selection algorithm (e.g., the median-of-medians 
algorithm [Blu+73]). □ 


We have obtained a relatively simple algorithm that implements many divisor methods 
and has optimal asymptotic running time in the worst case. It remains to be seen if it is 
also efficient in practice. 


4. Comparison of Algorithms 

We have implemented all algorithms mentioned above in Java [RW15a] with a focus on clarity 
and performance. Reviewing the algorithms resp. implementations (cf. Appendix C), 
we observe that neither IterativeMethod nor JumpAndStep is asymptotically 
worst-case efficient whereas ChengEppsteinSelect does not seem to be practical 
regarding implementability. SandwichSelect does not have either deficiency and is 
still the shortest of the non-trivial algorithms. 

[Figure 1 not available.] 
Figure 1: This figure shows average running times of SandwichSelect, ChengEppsteinSelect, 
JumpAndStep with naive resp. priority-queue minimum selection, and IterativeMethod 
with naive resp. priority-queue minimum selection, normalized by the number of parties $n$. 
The inputs are random apportionment instances with vote counts $v_i$ drawn i.i.d. uniformly 
from $[1, 3]$. The numbers of parties $n$, house size $k$ and method parameters $(\alpha, \beta)$ 
have been chosen to resemble national parliaments in Europe (left) and the U.S. 
House of Representatives (right), respectively. 

[Figure 2 not available.] 
Figure 2: Running times on individual inputs plotted against $|\hat{A}|$ for SandwichSelect 
resp. $|\Delta_a|$ for JumpAndStep (left and right panels). Inputs are random with exponentially 
distributed $v_i$ for $n \in \{1, 5, 10, 20, 30, 40, 50, 75, 100\}$ times a fixed power of ten 
and $k = 5n$; they have been apportioned w.r.t. $(\alpha, \beta) = (2, 1)$. 


We evaluate relative practical efficiency by performing running time experiments on 
artificial instances; we fix the number of parties n, house size k and the used divisor 
method and draw multiple vote vectors v at random according to different distributions. 
Where possible, we draw votes from a continuous distribution with fixed expectation; 
this ensures that vote proportions do not devolve to trivial situations as n grows. 


In order to keep the parameter space manageable, we use $n$ as free variable and fix $k$ to 
a multiple of $n$. For ease of implementation, we restrict ourselves to divisor sequences of 
the form $(\alpha j + \beta)_{j \ge 0}$; this still allows us to cover a range of relevant divisor methods at 
least approximately (cf. Table 1). We describe the machine configuration used for the 
experiments and further details of the setup in Appendix D. 

Figure 1 shows the results of two experiments with practical parameter choices. It is clear 
that JumpAndStep dominates the field; of the other algorithms, only SandwichSelect 
comes close in performance. These observations are stable across many parameter 
choices; see also Appendix E. We will therefore restrict ourselves to JumpAndStep and 
SandwichSelect in the sequel. 

Towards understanding what influences the performance of these algorithms the most, 
we have investigated how $\Delta_a$ (the number of seats JumpAndStep assigns too much, i.e. 
$k - \sum s_i$) resp. $|\hat{A}|$ (the number of candidates SandwichSelect selects from) relate to 
the measured running times. While the connection is clear for SandwichSelect, we 
need to look at cases where Pukelsheim’s estimators are bad; as long as $|\Delta_a| \ll n$, the 
$\Theta(n)$ portions of JumpAndStep dominate. Figure 2 exhibits such a setting. 
[Figure 3 not available.] 
Figure 3: Normalized average runtimes of SandwichSelect (left) and JumpAndStep 
(right) on $v_i$ drawn randomly from uniform, exponential, Poisson and 
Pareto distributions, respectively, and with $k = 5n$ and $(\alpha, \beta) = (2, 1)$. 

[Figure 4 not available.] 
Figure 4: The left plot shows normalized running times of SandwichSelect and 
JumpAndStep on instances with $k = 2n$ and Pareto-distributed $v_i$ for 
$(\alpha, \beta) = (1.0, 0.001)$. The right plot shows that the average of $|\Delta_a|$ seems to 
converge towards a constant fraction of $n$ in this case. 


While JumpAndStep is faster than SandwichSelect in the experiments of Figure 1 
and similar ones, we observe that SandwichSelect is more robust against changing 
parameters. Figure 3 exhibits this for switching between different vote distributions: 
the average running times of SandwichSelect are close to each other where those of 
JumpAndStep spread out quite a bit. It may be noteworthy that each algorithm has 
one “outlier” distribution but they are not the same. 

JumpAndStep does indeed seem to outperform SandwichSelect consistently so far, 
if not by much in some cases. We have found a parameterization which, even though 
it is admittedly rather artificial, clearly suggests that JumpAndStep does indeed have 
$\omega(n)$ worst-case behavior and that SandwichSelect can be faster; see Figure 4. The 
question of whether there are realistic settings for which this is the case remains open. 

In summary, we have seen that SandwichSelect provides good performance in a reliable 
way, i.e., its efficiency does not depend much on divisor sequence or input. On the other 
hand, JumpAndStep is faster on average when good estimators are available, but can 
be slower in certain settings. 


5. Conclusion 

We have derived an algorithm implementing divisor methods of apportionment that is 
worst-case efficient, simple and practicable. As such, it does not have the shortcomings of 
previously known algorithms. Even though it cannot usually outperform JumpAndStep, 
its robustness against changing parameters makes it a viable candidate for use in practice. 


A. Our Scope of different Methods of Apportionment 


As we have seen in Section 2, there are many possible divisor sequences. For our main 
result, we follow Cheng and Eppstein [CE14] and require the sequences to 
be “almost” linear; we should check that we do not unduly restrict the scope of our 
investigation. 

We refer to the recent reference work by Pukelsheim [Puk14] and, by extension, to 
Balinski and Young [BY01], who classify different divisor methods of apportionment in 
terms of signpost sequences, a concept equivalent to the divisor sequences we use. They 
distinguish these classes of such sequences (cf. [Puk14, Sections 3.11-12]): 

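• stationary sign-posts of the form $s(n) = n - 1 + r$ with $r \in (0, 1)$; 

• power-mean sign-posts defined by 

$$\tilde{s}_p(0) = 0, \qquad \tilde{s}_p(n) = \Bigl( \frac{(n-1)^p + n^p}{2} \Bigr)^{1/p}$$

for $p \ne -\infty, 0, \infty$; 

• and special cases $\tilde{s}_{-\infty}(n) = n - 1$, $\tilde{s}_0(n) = \sqrt{(n-1)n}$, and $\tilde{s}_{\infty}(n) = n$. 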

It is easy to see that stationary sign-posts correspond to divisor sequences $d_j = j + \beta$ 
with $\beta \in (0, 1)$ (up to a shift by one); as such, Lemma 2 applies immediately with $\alpha = 1$ 
and $\underline{\beta} = \overline{\beta} = \beta$ and yields a particularly nice (and tight, for our choices of $\underline{a}$ and $\overline{a}$) 
upper bound on the size of the candidate set $\hat{A}$. We cover the special cases as well; see 
Table 1 for the corresponding sandwich bounds. 

As for the remaining power-mean sign-posts, the trivial bounds $\underline{\beta} = 0$ and $\overline{\beta} = 1$ already 
work. One can apply the power-mean inequality and use the slightly better bounds 
for $p \in \{-\infty, -1, 0, 1, \infty\}$ as given in Table 1. Even better bounds can be gleaned 
from observing that $\tilde{s}_p(n)$ converges to $n - 1/2$ from one side, and quickly so; $\tilde{s}_p(1)$ thus 
determines either $\underline{\beta}$ or $\overline{\beta}$ and the other can be chosen as $1/2$. 

In summary, our algorithm SandwichSelect applies to all divisor methods treated by 
Pukelsheim [Puk14] and Balinski and Young [BY01]. 

B. Lemmata and Proofs 

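Lemma 4: For rank function $r(x, \mathcal{A})$, 

$$r(x, \mathcal{A}) = \sum_{i=1}^{n} \bigl( \lfloor \delta^{-1}(v_i \cdot x) \rfloor + 1 \bigr).$$

Moreover, for $x < \bar{x}$ we have 

$$r(x, \mathcal{A}) = \sum_{i \in I_{\bar{x}}} \bigl( \lfloor \delta^{-1}(v_i \cdot x) \rfloor + 1 \bigr)$$

with $I_{\bar{x}} = \{ i \in \{1, \dots, n\} \mid v_i > d_0/\bar{x} \}$. 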
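
The first identity of Lemma 4 can be checked numerically with a small sketch. The code assumes a linear divisor sequence $d_j = \alpha j + \beta$ and the clipped inverse $\delta^{-1}(y) = \max\{(y - \beta)/\alpha, -1\}$; the helper names `rank` and `rank_brute_force`, and the cut-off `limit`, are illustrative assumptions and not part of the original implementation.

```python
import math
from fractions import Fraction


def rank(x, votes, alpha, beta):
    """r(x, A) via Lemma 4: sum over parties of floor(delta^{-1}(v_i * x)) + 1."""
    alpha, beta, x = Fraction(alpha), Fraction(beta), Fraction(x)

    def delta_inv(y):
        return max((y - beta) / alpha, Fraction(-1))

    return sum(math.floor(delta_inv(v * x)) + 1 for v in votes)


def rank_brute_force(x, votes, alpha, beta, limit=200):
    """Count elements d_j / v_i <= x directly, looking at j < limit only."""
    alpha, beta, x = Fraction(alpha), Fraction(beta), Fraction(x)
    return sum(1 for v in votes for j in range(limit)
               if (alpha * j + beta) / v <= x)


if __name__ == "__main__":
    votes = [5, 3, 1]
    for x in (Fraction(1, 2), 1, 2):
        print(x, rank(x, votes, 2, 1), rank_brute_force(x, votes, 2, 1))
```

The closed form needs only one evaluation of $\delta^{-1}$ per party, which is what makes counting discarded elements in SandwichSelect cheap.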


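B.1. Proof of Lemma 4 

By eq. (2), it suffices to show that 

$$|\{ a_{i,j} \mid a_{i,j} \le x \}| = \lfloor \delta^{-1}(v_i \cdot x) \rfloor + 1$$

for each $i \in \{1, \dots, n\}$. Now, if $x \ge a_{i,j} = d_j/v_i$ for some $j$, then $v_i \cdot x \ge d_j$, and so 
$\hat{j} = \lfloor \delta^{-1}(v_i \cdot x) \rfloor$ is the largest index $j'$ for which $a_{i,j'} = d_{j'}/v_i \le x$. As $d_j$ is zero-based, there 
are $\hat{j} + 1 \ge 1$ such elements $a_{i,j} \le x$ and the equation follows. 

Otherwise, that is $a_{i,j} > x$ for all $j$, we have $\hat{j} = \lfloor \delta^{-1}(v_i \cdot x) \rfloor = -1$ by iv) and Corollary 1, 
and the equality holds with 0 on both sides. 

For the second equality, we only have to show that the omitted summands are zero. So 
let $i \notin I_{\bar{x}}$ be given, that is $v_i \le d_0/\bar{x}$. For $x < \bar{x}$, we have 

$$v_i \cdot x \le \frac{d_0}{\bar{x}} \cdot x < \frac{d_0}{\bar{x}} \cdot \bar{x} = d_0,$$

and hence $\lfloor \delta^{-1}(v_i \cdot x) \rfloor = -1$ by iv). 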


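Lemma 5: Let $\bar{x} > a^*$ and assume $\underline{a}$ and $\overline{a}$ are chosen so that they fulfill 

$$\sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \underline{a}) \le k - |I_{\bar{x}}| \quad\text{and}\quad \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \overline{a}) \ge k.$$

Then, $\underline{a} \le a^* \le \overline{a}$. 

The lemma follows more or less directly; one uses the sandwich bounds on $r$ to show 
that $a < \underline{a}$ are infeasible, i.e., $r(a) < k$, and that $\overline{a}$ is feasible, and thus all $a > \overline{a}$ are 
suboptimal since $a^*$ is the smallest feasible element in $\mathcal{A}$. 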


B.2. Proof of Lemma 5 

As a direct consequence of Lemma 4 together with the fundamental bounds $y - 1 < \lfloor y \rfloor \le y$ 
on floors, we find that 

$$\sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot x) < r(x, \mathcal{A}) \le \sum_{i \in I_{\bar{x}}} \bigl( \delta^{-1}(v_i \cdot x) + 1 \bigr) = |I_{\bar{x}}| + \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot x) \qquad (7)$$

for any $\bar{x}$ and all $x < \bar{x}$. We can therewith pin down the value of $r$ to an interval of 
width $|I_{\bar{x}}|$ using only $\delta^{-1}$. We can use this to derive upper and lower bounds on $a^*$. 

We show that smaller $a$ are infeasible and larger $a$ are clearly suboptimal, so the optimal 
$a^*$ must lie in between. Let us first consider $a < \underline{a}$. There are two cases: if there is a 
$v_i$ such that $v_i a \ge d_0$, we get by strict monotonicity of $\delta^{-1}$ and (7) 

$$r(a) \le |I_{\bar{x}}| + \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot a) < |I_{\bar{x}}| + \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \underline{a}) \le k$$

and $a$ is infeasible. If otherwise $v_i a < d_0$, i.e., $a < d_0/v_i$, for all $i$, $a$ must clearly have 
rank $r(a) = 0$ as it is smaller than any element $a_{i,j} \in \mathcal{A}$. In both cases we found that 
$a < \underline{a}$ has rank $r(a) < k$. 

Now consider the upper bound, i.e., we have $a > \overline{a}$. In case $\overline{a} \ge \bar{x}$, we have $a > \bar{x} > a^*$ 
by assumption and any such $a$ cannot be optimal. Otherwise, for $\overline{a} < \bar{x}$, we have 

$$r(\overline{a}) > \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \overline{a}) \ge k,$$

so $\overline{a}$ is feasible. Any element $a > \overline{a}$ can thus not be the optimal solution $a^*$, which is the 
minimal $a$ with $r(a) \ge k$. 


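B.3. Proof of Lemma 2 

We consider the linear divisor sequence continuations 

$$\underline{\delta}(j) = \alpha j + \underline{\beta} \quad\text{and}\quad \overline{\delta}(j) = \alpha j + \overline{\beta}$$

for all $j \in \mathbb{R}_{\ge 0}$ and start by noting that the inverses are 

$$\underline{\delta}^{-1}(x) = x/\alpha - \underline{\beta}/\alpha \quad\text{and}\quad \overline{\delta}^{-1}(x) = x/\alpha - \overline{\beta}/\alpha$$

for $x \ge \underline{\delta}(0) = \underline{\beta}$ and $x \ge \overline{\delta}(0) = \overline{\beta}$, respectively. For smaller $x$, we are free to choose the 
value of the continuation from $[-1, 0)$ (cf. iv)); noting that $x/\alpha - \beta/\alpha < 0$ for $x < \beta$, a 
choice that will turn out convenient is 

$$\underline{\delta}^{-1}(x) := \max\Bigl\{ \frac{x}{\alpha} - \frac{\underline{\beta}}{\alpha},\ -1 \Bigr\} \quad\text{resp.}\quad \overline{\delta}^{-1}(x) := \max\Bigl\{ \frac{x}{\alpha} - \frac{\overline{\beta}}{\alpha},\ -1 \Bigr\}. \qquad (8)$$

We state the following simple property for reference; it follows from $\underline{\delta}(j) \le \delta(j) \le \overline{\delta}(j)$ 
and the definition of the inverses (recall that $\underline{\beta} \le \alpha$): 

$$\frac{x}{\alpha} - \frac{\overline{\beta}}{\alpha} \le \overline{\delta}^{-1}(x) \le \delta^{-1}(x) \le \underline{\delta}^{-1}(x) \le \frac{x}{\alpha} - \frac{\underline{\beta}}{\alpha} \quad\text{for } x \ge 0. \qquad (9)$$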


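Equipped with these preliminaries, we compute 

$$\overline{a} = \frac{\alpha k + \overline{\beta}\,|I_{\bar{x}}|}{V_{\bar{x}}} \iff \frac{\overline{a}}{\alpha} \sum_{i \in I_{\bar{x}}} v_i = k + \frac{\overline{\beta}}{\alpha} \cdot |I_{\bar{x}}|,$$

so, by (9), $\overline{a}$ satisfies the condition of Lemma 5. Similarly, we find 

$$\underline{a} = \frac{\alpha k - (\alpha - \underline{\beta}) \cdot |I_{\bar{x}}|}{V_{\bar{x}}} \iff \frac{\underline{a}}{\alpha} \cdot V_{\bar{x}} = k - \Bigl( 1 - \frac{\underline{\beta}}{\alpha} \Bigr) \cdot |I_{\bar{x}}|$$
$$\iff k = |I_{\bar{x}}| + \sum_{i \in I_{\bar{x}}} \Bigl( \frac{v_i \cdot \underline{a}}{\alpha} - \frac{\underline{\beta}}{\alpha} \Bigr) \ge |I_{\bar{x}}| + \sum_{i \in I_{\bar{x}}} \delta^{-1}(v_i \cdot \underline{a}),$$

that is $\underline{a}$ also fulfills the conditions of Lemma 5. 

For the bound on the number of elements falling between $\underline{a}$ and $\overline{a}$, we compute 

$$|\mathcal{A}^{\bar{x}} \cap [\underline{a}, \overline{a}]| \le \sum_{i \in I_{\bar{x}}} |A_i \cap [\underline{a}, \overline{a}]|$$
$$= \sum_{i \in I_{\bar{x}}} \Bigl| \Bigl\{ j \in \mathbb{N}_0 \;\Big|\; \underline{a} \le \frac{d_j}{v_i} \le \overline{a} \Bigr\} \Bigr|$$
$$= \sum_{i \in I_{\bar{x}}} |\{ j \in \mathbb{N}_0 \mid v_i \cdot \underline{a} \le d_j \le v_i \cdot \overline{a} \}|$$
$$= \sum_{i \in I_{\bar{x}}} |\{ j \in \mathbb{N}_0 \mid \delta^{-1}(v_i \cdot \underline{a}) \le j \le \delta^{-1}(v_i \cdot \overline{a}) \}|$$
$$\le \sum_{i \in I_{\bar{x}}} \bigl( \delta^{-1}(v_i \cdot \overline{a}) - \delta^{-1}(v_i \cdot \underline{a}) + 1 \bigr)$$
$$\le \sum_{i \in I_{\bar{x}}} \Bigl( \frac{v_i \cdot \overline{a} - \underline{\beta}}{\alpha} - \frac{v_i \cdot \underline{a} - \overline{\beta}}{\alpha} + 1 \Bigr) \qquad\text{by (9)}$$
$$= \sum_{i \in I_{\bar{x}}} \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} + \frac{v_i \cdot \overline{a} - v_i \cdot \underline{a}}{\alpha} \Bigr)$$
$$= \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot |I_{\bar{x}}| + (\overline{a} - \underline{a}) \cdot \frac{V_{\bar{x}}}{\alpha}$$
$$\le \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot |I_{\bar{x}}| + \frac{(\alpha + \overline{\beta} - \underline{\beta}) \cdot |I_{\bar{x}}|}{\alpha}$$
$$= 2 \Bigl( 1 + \frac{\overline{\beta} - \underline{\beta}}{\alpha} \Bigr) \cdot |I_{\bar{x}}|.$$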


C. Implementing the Algorithms 


In this section, we review existing algorithms for divisor methods. In particular, we 
elaborate on how we have implemented them for our experiments [RW15a], and on 
problems we have encountered in this process. 

We have taken care not to render the algorithm unnecessarily inefficient in order to 
perform a fair comparison of running times; the result is to the best of our abilities 
conditioned on a limited time budget. In particular, all of our implementations have 
been refined on the programming level to roughly the same degree. 

For the purpose of a fair comparison, all implementations have to conform to the same 
interface. 

Parameters: A pair $(\alpha, \beta)$ with $\alpha > 0$ and $\beta \ge 0$. 

Input: Votes $v$ and house size $k$. 

Output: A (symbolic) representation of all seat assignments valid w.r.t. divisor sequence 
$(\alpha j + \beta)_{j \ge 0}$, as well as proportionality constant $a^*$. 

More specifically, the output is encoded as a vector of undisputed seats and a binary 
vector indicating which parties are tied for the remaining seats. We skip the step from 
$a^*$ resp. a valid seat assignment to this representation in the pseudo code since it is 
elementary: all parties with “current” resp. “next” value $v_i/d_{s_i-1}$ resp. $v_i/d_{s_i}$ equal $a^*$ 
are tied. A simple $\Theta(n)$-time post-processing identifies these in all cases. 


We have established confidence in the correctness of our implementations by extensive
random testing [RW15a, TestMain.java]; every implementation has been run on thousands
of instances. The correctness of the results has been confirmed, besides rudimentary
sanity checks such as matching vector dimensions, by checking Pukelsheim’s Max-Min
Inequality [Puk14, Theorem 4.5].


All implementations share the same numerical weakness, though: using fixed-precision
arithmetic, two computations that should lead to the same result (say, $a^*$) may yield
different numbers. We compensate for that by using fuzzy comparisons: we treat numbers
as equal if they are within some constant $\varepsilon$ of each other. Thus, we can reliably
identify tied parties, for instance.


There is a drawback, though: if distinct values $v_i/d_j$ are closer than $\varepsilon$ (or, even without
this adaptation, closer than the resolution of the chosen fixed-precision number
representation), we may treat them as equal and thus compute wrong seat assignments.
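To make the idea concrete, here is a minimal Java sketch of such a fuzzy comparison. It is not taken from [RW15a]; the class name `FuzzyCompare`, the absolute tolerance `EPSILON` and its value are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class FuzzyCompare {
    // Hypothetical tolerance; a suitable value depends on the magnitude of the inputs.
    static final double EPSILON = 1e-9;

    // Treats two doubles as equal if they differ by at most EPSILON.
    static boolean fuzzyEquals(double x, double y) {
        return Math.abs(x - y) <= EPSILON;
    }

    // Returns the indices of all parties whose value is fuzzily equal to aStar.
    static List<Integer> tiedParties(double[] values, double aStar) {
        List<Integer> tied = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (fuzzyEquals(values[i], aStar)) {
                tied.add(i);
            }
        }
        return tied;
    }
}
```

The sketch also exhibits the drawback: any two genuinely distinct values that differ by less than `EPSILON` are reported as tied.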




This issue cannot be circumvented on the algorithmic level. The only robust resort
we know of is using arbitrary-precision arithmetic, which inevitably slows down all
the algorithms.


C.1. Iterative Divisor Method


Implementing IterativeMethod is straightforward. An implementation using a priority
queue implementation from the standard library runs in time $\Theta(n + k \log n)$. Since we
expect overhead for the queue to be significant for small $n$, we also implement a variant
which determines $I$ using a simple linear scan, resulting in a total running time in $\Theta(kn)$.


Shared code aside, IterativeMethod takes about 50 resp. 65 lines of code with resp.
without priority queues.
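The priority-queue variant can be sketched in Java as follows. This is an illustrative sketch and not the code from [RW15a]: the class name `IterativeMethodSketch`, the helper `Entry` and the tie-breaking by party index are assumptions, ties are not reported, and at least one party is assumed. The heap is built in one pass from the initial quotients and then updated once per seat, which is intended to match the $\Theta(n + k \log n)$ bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class IterativeMethodSketch {
    // Queue entry: a party together with its current quotient v_i / d_{s_i}.
    private static final class Entry implements Comparable<Entry> {
        final int party;
        final double quotient;

        Entry(int party, double quotient) {
            this.party = party;
            this.quotient = quotient;
        }

        @Override
        public int compareTo(Entry other) {
            // Larger quotients come first; ties are broken by party index (an arbitrary choice).
            int c = Double.compare(other.quotient, this.quotient);
            return c != 0 ? c : Integer.compare(this.party, other.party);
        }
    }

    // Assigns k seats one by one for the divisor sequence d_j = alpha * j + beta.
    static int[] apportion(double[] votes, int k, double alpha, double beta) {
        int n = votes.length;
        int[] seats = new int[n];
        List<Entry> initial = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            initial.add(new Entry(i, votes[i] / beta)); // d_0 = beta
        }
        PriorityQueue<Entry> queue = new PriorityQueue<>(initial);
        for (int assigned = 0; assigned < k; assigned++) {
            Entry best = queue.poll(); // party with the largest current quotient
            seats[best.party]++;
            double nextDivisor = alpha * seats[best.party] + beta;
            queue.add(new Entry(best.party, votes[best.party] / nextDivisor));
        }
        return seats;
    }
}
```

With `alpha = 1` and `beta = 1` the divisors are 1, 2, 3, and so on; with `alpha = 2` and `beta = 1` they are 1, 3, 5, and so on.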


C.2. Jump-and-Step 

The jump-and-step algorithm [Puk14, Section 4.6] can be formulated using our notation
as follows:

Algorithm 3: JumpAndStep_d(v, k):

Step 1 Compute an estimator $a$ for $a^*$.

Step 2 Initialize $s_i = \lfloor \delta^{-1}(v_i \cdot a) \rfloor + 1$.

Step 3 Iterate similarly to IterativeMethod until $\sum s_i = k$ with
\[
  I = \begin{cases}
        \arg\max_{i=1}^{n} \, v_i / d_{s_i},     & \sum s_i < k; \\
        \arg\min_{i=1}^{n} \, v_i / d_{s_i - 1}, & \sum s_i > k.
      \end{cases}
\]


The performance of this algorithm clearly depends on $\Delta_a := \sum s_i - k$ after Step 2; the
running time is in $O(n + |\Delta_a| \cdot \log n)$ when using priority queues for Step 3 (which may
not be advisable in practice if $|\Delta_a|$ can be expected to be very small). As such, the
running time is not per se bounded in $n$ and $k$.

We follow the recommendations of Pukelsheim [Puk14, Section 6.1] and use the estimator

\[
  a = \frac{\alpha}{V} \cdot
      \begin{cases}
        k + n \cdot (\beta/\alpha - 1/2),       & 0 \le \beta/\alpha \le 1; \\
        k + n \cdot \lfloor \beta/\alpha \rfloor, & \text{else.}
      \end{cases}
\]

The first case corresponds to Pukelsheim’s recommended estimator for stationary signpost
sequences, the second to his good universal estimator generalized to divisor sequences
that are not signpost sequences in the strict sense. The additional factor $\alpha$ rescales the
value appropriately; Pukelsheim only considers $\alpha = 1$.
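The three steps can be sketched in Java as follows. This is a simplified illustration and not the implementation from [RW15a]: the class name `JumpAndStepSketch` is hypothetical, Step 3 uses a linear scan instead of priority queues, ties are ignored, the initial seat numbers are clamped at zero, and the estimator is assumed to be scaled by $\alpha/V$ so that it lives on the same scale as the ratios $d_j/v_i$.

```java
final class JumpAndStepSketch {
    // Divisor sequence d_j = alpha * j + beta.
    static double divisor(int j, double alpha, double beta) {
        return alpha * j + beta;
    }

    // Step 1: estimator for a*, assumed here to be scaled by alpha / V.
    static double estimate(double[] votes, int k, double alpha, double beta) {
        double total = 0;
        for (double v : votes) {
            total += v;
        }
        int n = votes.length;
        double ratio = beta / alpha;
        double target = (ratio <= 1) ? k + n * (ratio - 0.5) : k + n * Math.floor(ratio);
        return alpha / total * target;
    }

    static int[] apportion(double[] votes, int k, double alpha, double beta) {
        int n = votes.length;
        double a = estimate(votes, k, alpha, beta);
        int[] seats = new int[n];
        int sum = 0;
        // Step 2 ("jump"): s_i = floor(delta^{-1}(v_i * a)) + 1, clamped at zero (an assumption).
        for (int i = 0; i < n; i++) {
            seats[i] = Math.max(0, (int) Math.floor((votes[i] * a - beta) / alpha) + 1);
            sum += seats[i];
        }
        // Step 3 ("step"), too few seats: add seats to the party with maximal v_i / d_{s_i}.
        while (sum < k) {
            int best = 0;
            for (int i = 1; i < n; i++) {
                if (votes[i] / divisor(seats[i], alpha, beta)
                        > votes[best] / divisor(seats[best], alpha, beta)) {
                    best = i;
                }
            }
            seats[best]++;
            sum++;
        }
        // Step 3 ("step"), too many seats: remove seats from the party with minimal v_i / d_{s_i - 1}.
        while (sum > k) {
            int worst = -1;
            for (int i = 0; i < n; i++) {
                if (seats[i] == 0) {
                    continue; // nothing to take away from this party
                }
                if (worst < 0 || votes[i] / divisor(seats[i] - 1, alpha, beta)
                        < votes[worst] / divisor(seats[worst] - 1, alpha, beta)) {
                    worst = i;
                }
            }
            seats[worst]--;
            sum--;
        }
        return seats;
    }
}
```

The number of iterations of the two `while` loops is exactly the discrepancy between the seats assigned in Step 2 and `k`, so the quality of the estimator determines the running time.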


Given that these estimators guarantee $|\Delta_a| \le n$ in the worst case, we can assume
that JumpAndStep runs in time $O(n \log n)$. Furthermore, Pukelsheim claims that the
recommended estimator is good in practice in the sense that $|\Delta_a| \in O(1)$ in expectation,
so JumpAndStep may be efficient in practice for large $n$ as well. Since the proof is
limited to uniformly distributed votes and $k \to \infty$, we investigate this in Section 4.

Shared code aside, JumpAndStep takes about 120 lines of code, with or without priority
queues.


C.3. The Algorithm of Cheng and Eppstein 


Cheng and Eppstein [CE14] do not give pseudocode for the main procedure of their
algorithm which would combine the individual steps to compute $a^*$. For the reader’s
convenience and for clarity concerning our running-time comparisons we give this top-level
procedure as we have inferred it.


Algorithm 4: ChengEppsteinSelect_d(v, k):

Step 1 Compute a suitable finite representation of $\mathcal{A}$.

Step 2 $C$ := FindContributingSequences($\mathcal{A}$, $k$).

Step 3 $\xi := s^{-1}(k)$ [CE14, (3)].

Step 4 If $r(\xi, \mathcal{A}) > k$ then
       $\xi$ := LowerRankCoarseSolution($\mathcal{A}$, …).

Step 5 Return CoarseToExact($\mathcal{A}$, $k$, $\xi$).


The subroutines are given in sufficient detail in their Algorithms 1 to 3, respectively. 
The pseudo code given uses some high-level set operations which we did not implement 
naively due to performance concerns; we compute several steps during a single iteration 
over the respective sets of sequences. 


Note that we have (hopefully) fixed an off-by-one mistake in the text. The definition of
rank $r(x, \mathcal{A})$ is, “the number of elements of $\mathcal{A}$ less than or equal to $x$”; that is, the rank
of $\mathcal{A}(j)$ is $j + 1$ since $\mathcal{A}$ is zero-based (the first element is $\mathcal{A}(0)$). However, the authors
continue to say that $r(x, \mathcal{A})$ “is the index $j$ such that $\mathcal{A}(j) \le x < \mathcal{A}(j + 1)$.”
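The discrepancy is easy to see on a small example. The following Java sketch uses a finite sorted array with distinct entries as a stand-in for $\mathcal{A}$ (an assumption made for illustration; the class name `RankSketch` is hypothetical) and computes the rank according to the first definition.

```java
final class RankSketch {
    // Number of elements in the ascendingly sorted array that are <= x.
    static int rank(double[] sorted, double x) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= x) {
                lo = mid + 1; // sorted[mid] counts towards the rank
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

For the array {0.5, 1.0, 1.5, 2.0} and x = 1.2, `rank` returns 2, whereas the zero-based index j with A(j) ≤ x < A(j + 1) is 1; the two definitions differ by exactly one.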


Regarding performance, Cheng and Eppstein show that their algorithm runs in time $O(n)$
in the worst case. Since ChengEppsteinSelect computes a linear number of medians
and requires a linear number of evaluations of rank function $r(x, \mathcal{A})$ (with geometrically
shrinking $|\mathcal{A}|$; otherwise the algorithm would not run in linear time), it is unclear
whether the algorithm is efficient in practice.


Shared code aside, ChengEppsteinSelect takes about 300 lines of code. By this
measure, it is the most complex of the algorithms we consider.


Additional Issues with Numerics 


In addition to the concerns expressed above, there are additional numerical issues when
implementing ChengEppsteinSelect using fixed-precision floating-point arithmetic.
In short, we have to compute certain floors and ceilings of real numbers exactly or we
may compute a wrong result.

More specifically, we evaluate $r(x, \mathcal{A})$ several times by computing terms of the form
$\lfloor \delta^{-1}(v_i \cdot x) \rfloor$ (cf. Lemma 4). The problem is that the result of $\delta^{-1}(v_i \cdot x)$ is
non-integral in general, but is integral when the argument evaluates exactly to a $d_j$. With the
usual floating-point arithmetic the result might be slightly smaller, though. We then
erroneously round down to the next smaller integer, which is a critical error.

In practice, we can add a small constant to the mantissa before taking the floor. This
constant has to be chosen large enough to cover potential rounding errors, but also
small enough so as to not change subsequent calculations; ChengEppsteinSelect may
compute a wrong answer otherwise. This is a very delicate requirement we do not know
how to fulfill in general.
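The following Java sketch illustrates both the problem and the workaround. It simplifies the idea by adding an absolute slack to the value instead of manipulating the mantissa; the class name `GuardedFloor`, the constant `SLACK` and its value are assumptions, and choosing the constant correctly in general is exactly the open problem described above.

```java
final class GuardedFloor {
    // Hypothetical slack: must exceed the accumulated rounding error but stay far below 1.
    static final double SLACK = 1e-9;

    // Floor that tolerates results lying slightly below an integer due to rounding.
    static long guardedFloor(double x) {
        return (long) Math.floor(x + SLACK);
    }

    // Number of indices j >= 0 with alpha * j + beta <= y,
    // i.e. floor(delta^{-1}(y)) + 1, or 0 if y is smaller than d_0 = beta.
    static long countUpTo(double y, double alpha, double beta) {
        return Math.max(0, guardedFloor((y - beta) / alpha) + 1);
    }
}
```

As an example of the failure mode, the expression `(0.1 + 0.7) * 10` evaluates to a double slightly below 8, so `Math.floor` yields 7 whereas `guardedFloor` yields 8.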


C.4. SandwichSelect 


We have already discussed our algorithm at length in Section 3. Since we want to investigate
practical performance, we implement rank-selection using average-case efficient Quickselect
as opposed to using a linear-time algorithm with large constant factors.


We want to emphasize that our final algorithm SandwichSelect is conceptually simple
in the sense that there is little hidden complexity. We need exactly one call to a rank
selection algorithm on a linear-size list which takes five additional linear-time operations
to come up with: finding the maximal value $v^{(1)}$, constructing index set $I_x$, computing $V_x$,
constructing multiset $\hat{\mathcal{A}}$ and computing $\hat{k}$. These are all quite elementary tasks in that
they use one for-loop each which runs for at most $n$ iterations with only few operations
in each. We therefore think that we can outperform ChengEppsteinSelect in practice,
and should not be far behind JumpAndStep either.


Regarding implementation, the delicate part was to get the bounds on $j$ (cf. Step 6.1)
right. We use floor and ceiling functions on real numbers, so rounding errors that occur
in fixed-precision floating-point arithmetic can cause harm. We can circumvent this by
adding (subtracting) a conservatively large constant to (from) the mantissa of the floats
before taking floors (ceilings). If this constant is larger than necessary for covering
rounding errors, we might add slightly more candidates to $\mathcal{A}$ (at most two per party), which
would slightly degrade performance. Correctness, however, is not affected (in contrast to
ChengEppsteinSelect).
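A minimal Java sketch of such conservatively widened bounds for one party is given below. It assumes the divisor sequence $d_j = \alpha j + \beta$ and an interval of candidate ratios $d_j/v_i$; the class name `CandidateRange`, the parameter names and the value of `SLACK` are hypothetical, and an absolute slack is used in place of a mantissa manipulation.

```java
final class CandidateRange {
    // Hypothetical slack, deliberately generous; any value below 1 adds at most two candidates.
    static final double SLACK = 1e-6;

    // Returns {jMin, jMax}, a range of indices j >= 0 that contains every j with
    // lower <= (alpha * j + beta) / v <= upper; the range is empty if jMax < jMin.
    static long[] range(double v, double lower, double upper, double alpha, double beta) {
        // Subtract the slack before the ceiling: jMin can only become smaller.
        long jMin = Math.max(0, (long) Math.ceil((v * lower - beta) / alpha - SLACK));
        // Add the slack before the floor: jMax can only become larger.
        long jMax = (long) Math.floor((v * upper - beta) / alpha + SLACK);
        return new long[] {jMin, jMax};
    }
}
```

Since the slack only ever widens the range, a value chosen too large costs at most one extra candidate at either end, while no valid candidate is lost.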


We also remark here that the code [RW15a] for the experimental results discussed in
Section 4 is based on an earlier version of Lemma 2 with slightly weaker bounds (cf.
Appendix G). Experiments with the updated code are to follow, and might yield slight
improvements for SandwichSelect.


Shared code aside, SandwichSelect takes about 100 lines of code. By this measure, it
is the least complex of the non-trivial algorithms we consider.


D. Experimental Setup 


We have run the experiments with Java 7 on Ubuntu 14.04 LTS running kernel 3.13.0-34-
generic x86_64 GNU/Linux. The hardware platform is a ThinkPad T430s Tablet with
the following core parameters according to lshw.

CPU: Intel® Core™ i5-3320M CPU @ 2.60GHz

Cache: L1 32KiB, L2 256KiB, L3 3MiB

RAM: 4+4GiB SODIMM DDR3 Synchronous 1600 MHz (0.6 ns)

As our code is written in Java, we include a warm-up phase to trigger just-in-time
compilation of the relevant methods. All times are measured using the built-in method
System.nanoTime(). We use the same set of inputs for all algorithms, all of which have
to construct the full set $\mathcal{S}(\mathbf{v}, k)$ for each input $(\mathbf{v}, k)$ during the measurement.

In order to increase accuracy, we repeat the execution of each algorithm on each input 
several times and measure the total time; we then report the average time per execution. 
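The measurement scheme can be sketched in Java as follows. This is not the harness from [RW15a]; the class `TimingSketch`, the interface `Task` and the parameter names are hypothetical, and the result of each run is accumulated only to keep the just-in-time compiler from discarding the work.

```java
final class TimingSketch {
    // Hypothetical stand-in for one run of an algorithm on one input.
    interface Task {
        long run();
    }

    // Runs the task warmUpRuns times unmeasured, then measuredRuns times measured,
    // and returns the average time per measured execution in nanoseconds.
    static double averageNanos(Task task, int warmUpRuns, int measuredRuns) {
        long sink = 0;
        for (int i = 0; i < warmUpRuns; i++) {
            sink += task.run(); // warm-up: give the JIT a chance to compile the hot methods
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink += task.run();
        }
        long total = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never executed; keeps sink observable
        }
        return (double) total / measuredRuns;
    }
}
```

Measuring the total over all repetitions and dividing afterwards reduces the influence of the timer resolution on short runs.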

For the selection-based algorithms, we use the randomized Quickselect-based implementation
by Sedgewick and Wayne [SW11] as published on the book website. We use
the (pseudo) random number generators for several distributions from the same library
(download of stdlib-package.jar on August 11th, 2015).

For reproducing our running time experiments, make sure you have a working GNU/Linux¹
installation with Ruby, Java 7 and Ant; then execute


ruby run_experiments.rb arxiv.experiment

for the data represented in Section 4 and Appendix E. Be warned: this may run for a
long time, and it will create lots of images (provided you have gnuplot installed).

¹ Our framework may work on other platforms, maybe with small adjustments to the Ruby code, but we
have not tried. See README.md for a workaround.


E. More Running-Time Experiments

We apologize for offering only draft graphics without commentary for the time being.

Average normalized runtimes for several input distributions and across several orders of
magnitude of $n$.

Normalized runtimes of ChengEppsteinSelect for several input distributions and
across several orders of magnitude of $n$.

Normalized runtimes of JumpAndStep for several input distributions and across several
orders of magnitude of $n$.

Normalized runtimes of SandwichSelect for several input distributions and across
several orders of magnitude of $n$.

Normalized $|\Delta_a|$ of JumpAndStep for several input distributions and across several orders
of magnitude of $n$.

Normalized $|\hat{\mathcal{A}}|$ of SandwichSelect for several input distributions and across several
orders of magnitude of $n$.

Runtimes against $|\Delta_a|$ of JumpAndStep for several input distributions and across several
orders of magnitude of $n$. Each color stands for one $n$.

Runtimes against $|\hat{\mathcal{A}}|$ of SandwichSelect for several input distributions and across
several orders of magnitude of $n$. Each color stands for one $n$.
F. Index of Used Notation 

In this section, we collect the notation used in this paper. Some might be seen as 
“standard”, but we think including them here hurts less than a potential misunderstanding 
caused by omitting them. 

Generic Mathematical Notation


$\lfloor x \rfloor$, $\lceil x \rceil$ ... floor and ceiling functions, as used in [GKP94].

$M_{(k)}$ ... the $k$th smallest element of (multi)set/vector $M$ (assuming it exists);
if the elements of $M$ can be written in non-decreasing order, $M$ is given
by $M_{(1)} \le M_{(2)} \le M_{(3)} \le \cdots$.

Example: For $M = \{5, 8, 8, 8, 10, 10\}$, we have $M_{(1)} = 5$,
$M_{(2)} = M_{(3)} = M_{(4)} = 8$, and $M_{(5)} = M_{(6)} = 10$.

$M^{(k)}$ ... similar to $M_{(k)}$, but denotes the $k$th largest element.

$\mathbf{x} = (x_1, \ldots, x_d)$ ... to emphasize that $\mathbf{x}$ is a vector, it is written in bold;
components of the vector are written in regular type.

$\mathcal{M}$ ... to emphasize that $\mathcal{M}$ is a multiset, it is written in calligraphic type.

$\mathcal{M}_1 \uplus \mathcal{M}_2$ ... multiset union; multiplicities add up.


Notation Specific to the Problem

party, seat, vote (count), chamber size ... Parties are assigned seats (in parliament), so that
the number of seats $s_i$ that party $i$ is assigned is (roughly) proportional to that party’s vote
count $v_i$ and the overall number of assigned seats equals the chamber size $k$.

$d = (d_j)_{j=0}^{\infty}$ ... the divisor sequence used in the highest averages method; $d$ must be a
nonnegative, (strictly) increasing and unbounded sequence.

$\delta$, $\delta^{-1}$ ... a continuation of $j \mapsto d_j$ on the reals and its inverse, both of which can
be evaluated in constant time.

$n$ ... number of parties in the input.

$\mathbf{v}$, $v_i$ ... $\mathbf{v} = (v_1, \ldots, v_n) \in \mathbb{Q}_{>0}^n$, vote counts of the parties in the input.

$V$ ... the sum $v_1 + \cdots + v_n$ of all vote counts.

$k$ ... $k \in \mathbb{N}$, the number of seats to be assigned; also called house size.

$\mathbf{s}$, $s_i$ ... $\mathbf{s} = (s_1, \ldots, s_n) \in \mathbb{N}_0^n$, the number of seats assigned to the respective
parties; the result.

$a_{i,j}$ ... $a_{i,j} := d_j / v_i$, the ratio used to define divisor methods; $i$ is the party, $j$
is the number of seats $i$ has already been assigned.

$\mathcal{A}_i$ ... for party $i$, $\mathcal{A}_i := \{a_{i,0}, a_{i,1}, a_{i,2}, \ldots\}$ is the list of (reciprocals of)
party $i$’s ratios.

$a$ ... we use $a$ as a free variable when an arbitrary $a_{i,j}$ is meant.

$\mathcal{A}$ ... $\mathcal{A} := \mathcal{A}_1 \uplus \cdots \uplus \mathcal{A}_n$ is the multiset of all averages.

$r(x, \mathcal{A})$ ... the rank of $x$ in $\mathcal{A}$, that is the number of elements in multiset $\mathcal{A}$ that
are no larger than $x$; $r(x)$ for short if $\mathcal{A}$ is clear from context.

$a^*$ ... the ratio $a^* = a_{i^*, j^*}$ selected for assigning the last (i. e. the $k$th) seat;
corresponds to $\mathbf{s}$ by $s_i = r(a^*, \mathcal{A}_i)$; $a^* = \mathcal{A}_{(k)}$ (cf. Section 2 and
Section 3).

$x$ ... an upper bound $x \ge a^*$; we use $x = d_{k-1}/v^{(1)} + \varepsilon$, where $\varepsilon > 0$ is a
suitable constant.

$I_x$ ... $I_x := \{ i \mid v_i > d_0/x \}$; the set of parties $i$ whose vote count is large
enough, so that $a_{i,0} < x$, i. e. so that they contribute to the rank of $x$ in $\mathcal{A}$.

$V_x$ ... the sum of the vote counts of all parties in $I_x$.

$\mathcal{A}^x$ ... the elements in $\mathcal{A}$ that are smaller than $x$, i. e., $\mathcal{A} \cap (-\infty, x)$.

$\underline{a}$, $\overline{a}$ ... lower and upper bounds on candidates $\underline{a} \le a \le \overline{a}$ such that still
$a^* \in \mathcal{A} \cap [\underline{a}, \overline{a}]$.


G. Changelog 


The following (substantial) changes have been made from arXiv version 2 to 3. 


• Lemma 2 has been strengthened; both $\underline{a}$ and the upper bound on $|\mathcal{A} \cap [\underline{a}, \overline{a}]|$ have
been improved. Both changes are due to the observation that we could require
$\underline{\beta} \le \alpha$ without loss of generality.

Related notation update: (/3,/3) ^ (/S,/3)- 


• We have added Appendix A in order to clarify that the assumptions we make for
our main result do not restrict the scope of divisor methods we cover by too much.

